Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Labelling logical structures of document images using a dynamic perceptive neural network

Internal identifier: 002673 (Main/Exploration); previous: 002672; next: 002674

Authors: Yves Rangoni [France]; Abdel Belaïd [France]; Szilárd Vajda [Germany, United States]

RBID : ISTEX:6D610809609F4620C8C9723AAA7C657BD30A6004

Abstract

This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs OCR and finally exploits the OCR output to find the meaning of each block of text (i.e. it assigns labels such as “Title”, “Author”, etc.). The method extends our previous work, in which a classifier, the perceptive neural network, was developed as an analogy of human perception. We introduce a temporal dimension into this connectionist model by using a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track of and reusing the previous outputs. This dynamic classifier thus handles noise and segmentation errors better. The experiments were carried out on two datasets: the public MARG dataset, containing more than 1,500 front pages of scientific papers with four zones of interest, and a second one composed of documents from the Siggraph 2003 conference, in which 21 logical structures have been identified. The error rate is less than 2.5% on MARG and 7.3% on the Siggraph dataset.
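
The recognition stage described in the abstract (several cycles, each one reusing the previous outputs) can be pictured with a minimal sketch. This is not the authors' perceptive or time-delay neural network: `recognise_with_feedback`, `toy_classify`, the `delay` window and the two-label set are hypothetical names and values chosen only to illustrate the feedback loop.

```python
from typing import Callable, List, Sequence

Scores = List[float]

# Two of the labels mentioned in the abstract; the real systems use more.
LABELS = ["Title", "Author"]


def recognise_with_feedback(
    features: Sequence[float],
    classify: Callable[[Sequence[float], Sequence[Scores]], Scores],
    cycles: int = 3,
    delay: int = 2,
) -> List[Scores]:
    """Run several recognition cycles, feeding the last `delay` outputs back in."""
    history: List[Scores] = []
    for _ in range(cycles):
        # Previous outputs are kept and passed to the classifier as context.
        context = history[-delay:]
        history.append(classify(features, context))
    return history


def toy_classify(features: Sequence[float], context: Sequence[Scores]) -> Scores:
    """Hypothetical stand-in for a trained network (assumption, not the paper's model)."""
    raw = [features[0], 1.0 - features[0]]
    if not context:
        return raw
    previous = context[-1]
    # Blend the fresh scores with the most recent previous output.
    return [0.5 * r + 0.5 * p for r, p in zip(raw, previous)]


history = recognise_with_feedback([0.8], toy_classify)
final_scores = history[-1]
best = LABELS[final_scores.index(max(final_scores))]
print(best)
```

The loop only shows how earlier outputs can be carried into later cycles; how the actual model uses them for correction is described in the paper itself.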

DOI: 10.1007/s10032-011-0151-y


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Labelling logical structures of document images using a dynamic perceptive neural network</title>
<author>
<name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
</author>
<author>
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
<author>
<name sortKey="Vajda, Szilard" sort="Vajda, Szilard" uniqKey="Vajda S" first="Szilárd" last="Vajda">Szilárd Vajda</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:6D610809609F4620C8C9723AAA7C657BD30A6004</idno>
<date when="2011" year="2011">2011</date>
<idno type="doi">10.1007/s10032-011-0151-y</idno>
<idno type="url">https://api.istex.fr/ark:/67375/VQC-RJG8K4HL-W/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001948</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001948</idno>
<idno type="wicri:Area/Istex/Curation">001929</idno>
<idno type="wicri:Area/Istex/Checkpoint">000573</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000573</idno>
<idno type="wicri:doubleKey">1433-2833:2011:Rangoni Y:labelling:logical:structures</idno>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:inria-00579825</idno>
<idno type="url">https://hal.inria.fr/inria-00579825</idno>
<idno type="wicri:Area/Hal/Corpus">002E16</idno>
<idno type="wicri:Area/Hal/Curation">002E16</idno>
<idno type="wicri:Area/Hal/Checkpoint">002040</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">002040</idno>
<idno type="wicri:doubleKey">1433-2833:2011:Rangoni Y:labelling:logical:structures</idno>
<idno type="wicri:Area/Main/Merge">002715</idno>
<idno type="wicri:Area/Main/Curation">002673</idno>
<idno type="wicri:Area/Main/Exploration">002673</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Labelling logical structures of document images using a dynamic perceptive neural network</title>
<author>
<name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
<affiliation wicri:level="1">
<country xml:lang="fr">France</country>
<wicri:regionArea>Nancy 2 University, LORIA, Vandœuvre-Lès-Nancy</wicri:regionArea>
<wicri:noRegion>Vandœuvre-Lès-Nancy</wicri:noRegion>
<wicri:noRegion>Vandœuvre-Lès-Nancy</wicri:noRegion>
</affiliation>
<affiliation></affiliation>
</author>
<author>
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<affiliation wicri:level="1">
<country xml:lang="fr">France</country>
<wicri:regionArea>Nancy 2 University, LORIA, Vandœuvre-Lès-Nancy</wicri:regionArea>
<wicri:noRegion>Vandœuvre-Lès-Nancy</wicri:noRegion>
<wicri:noRegion>Vandœuvre-Lès-Nancy</wicri:noRegion>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
<author>
<name sortKey="Vajda, Szilard" sort="Vajda, Szilard" uniqKey="Vajda S" first="Szilárd" last="Vajda">Szilárd Vajda</name>
<affiliation wicri:level="3">
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Computer Science Department, TU Dortmund, Dortmund</wicri:regionArea>
<placeName>
<region type="land" nuts="1">Rhénanie-du-Nord-Westphalie</region>
<region type="district" nuts="2">District d'Arnsberg</region>
<settlement type="city">Dortmund</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">International Journal on Document Analysis and Recognition (IJDAR)</title>
<title level="j" type="abbrev">IJDAR</title>
<idno type="ISSN">1433-2833</idno>
<idno type="eISSN">1433-2825</idno>
<imprint>
<publisher>Springer-Verlag</publisher>
<pubPlace>Berlin/Heidelberg</pubPlace>
<date type="published" when="2012-03-01">2012-03-01</date>
<biblScope unit="volume">15</biblScope>
<biblScope unit="issue">1</biblScope>
<biblScope unit="page" from="45">45</biblScope>
<biblScope unit="page" to="55">55</biblScope>
</imprint>
<idno type="ISSN">1433-2833</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Document image analysis and recognition</term>
<term>Layout analysis</term>
<term>Logical labelling</term>
<term>Perceptive neural network</term>
<term>Time-delay neural network</term>
</keywords>
<keywords scheme="mix" xml:lang="en">
<term>Document image analysis and recognition</term>
<term>Layout analysis</term>
<term>Perceptive neural network</term>
<term>Time-delay neural network</term>
<term>logical labelling</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: This paper proposes a new method for labelling the logical structures of document images. The system starts with digitised images of paper documents, performs a physical layout analysis, runs OCR and finally exploits the OCR output to find the meaning of each block of text (i.e. it assigns labels such as “Title”, “Author”, etc.). The method extends our previous work, in which a classifier, the perceptive neural network, was developed as an analogy of human perception. We introduce a temporal dimension into this connectionist model by using a time-delay neural network with local representation. During the recognition stage, the system performs several recognition cycles and corrections, while keeping track of and reusing the previous outputs. This dynamic classifier thus handles noise and segmentation errors better. The experiments were carried out on two datasets: the public MARG dataset, containing more than 1,500 front pages of scientific papers with four zones of interest, and a second one composed of documents from the Siggraph 2003 conference, in which 21 logical structures have been identified. The error rate is less than 2.5% on MARG and 7.3% on the Siggraph dataset.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
<li>France</li>
<li>États-Unis</li>
</country>
<region>
<li>District d'Arnsberg</li>
<li>Grand Est</li>
<li>Lorraine (région)</li>
<li>Rhénanie-du-Nord-Westphalie</li>
</region>
<settlement>
<li>Dortmund</li>
<li>Nancy</li>
</settlement>
<orgName>
<li>Centre national de la recherche scientifique</li>
<li>Institut national de recherche en informatique et en automatique</li>
<li>Laboratoire lorrain de recherche en informatique et ses applications</li>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree>
<country name="France">
<noRegion>
<name sortKey="Rangoni, Yves" sort="Rangoni, Yves" uniqKey="Rangoni Y" first="Yves" last="Rangoni">Yves Rangoni</name>
</noRegion>
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
</country>
<country name="Allemagne">
<region name="Rhénanie-du-Nord-Westphalie">
<name sortKey="Vajda, Szilard" sort="Vajda, Szilard" uniqKey="Vajda S" first="Szilárd" last="Vajda">Szilárd Vajda</name>
</region>
</country>
<country name="États-Unis">
<noRegion>
<name sortKey="Vajda, Szilard" sort="Vajda, Szilard" uniqKey="Vajda S" first="Szilárd" last="Vajda">Szilárd Vajda</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
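
As a rough illustration of how a record like the one above might be read programmatically, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The embedded `RECORD` string is a trimmed, hand-made copy of the record: the `wicri:` attributes are left out because the page shows no namespace declaration for that prefix, and the helper name `summarise` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record shown above (assumption: wicri: attributes and
# most fields are omitted so the fragment parses without namespace bindings).
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Labelling logical structures of document images using a dynamic perceptive neural network</title>
<author><name>Yves Rangoni</name></author>
<author><name>Abdel Belaïd</name></author>
<author><name>Szilárd Vajda</name></author>
</titleStmt>
<publicationStmt>
<idno type="RBID">ISTEX:6D610809609F4620C8C9723AAA7C657BD30A6004</idno>
<idno type="doi">10.1007/s10032-011-0151-y</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def summarise(record_xml: str) -> dict:
    """Extract the title, author names and DOI from a record string."""
    root = ET.fromstring(record_xml)
    title_stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    pub_stmt = root.find("./TEI/teiHeader/fileDesc/publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        # ElementTree supports simple attribute predicates in paths.
        "doi": pub_stmt.findtext("idno[@type='doi']"),
    }


summary = summarise(RECORD)
print(summary["authors"], summary["doi"])
```

Parsing the full record as displayed would first require binding the `wicri` prefix to a namespace URI, which this page does not show.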

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002673 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002673 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:6D610809609F4620C8C9723AAA7C657BD30A6004
   |texte=   Labelling logical structures of document images using a dynamic perceptive neural network
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022